Document Classification and Visualisation to Support the Investigation of Suspected Fraud

نویسندگان

  • Johan Hagman
  • Domenico Perrotta
  • Ralf Steinberger
  • Aristide Varfis
چکیده

This position paper reports on ongoing work where three clustering and visualisation techniques for large document collections – developed at the Joint Research Centre (JRC) – are applied to textual data to support the European Commission’s investigation on suspected fraud cases. The techniques are (a) an implementation of the neural network application WEBSOM, (b) hierarchical cluster analysis, and (c) a method to present collections in two-dimensional space which is based on previous hierarchical clustering. In order to put these three techniques into their context, we describe the general design of a multilingual document retrieval, information extraction and visualisation system which is being developed at the JRC to support the Anti-Fraud Office (OLAF) of the European Commission in their fight against fraud. The description includes information on the individual components of the system, i.e. an agent to retrieve documents from the internet, a language recogniser, a tool to recognise geographical references in text, a keyword identification tool, as well as a word clustering component.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Document Classi cation and Visualisation to Support the Investigation of Suspected Fraud

This position paper reports on ongoing work where three clustering and visualisation techniques for large document collections { developed at the Joint Research Centre (JRC) { are applied to textual data to support the European Commission's investigation on suspected fraud cases. The techniques are (a) an implementation of the neural network application WEBSOM, (b) hierarchical cluster analysis...

متن کامل

Financial Reporting Fraud Detection: An Analysis of Data Mining Algorithms

In the last decade, high profile financial frauds committed by large companies in both developed and developing countries were discovered and reported. This study compares the performance of five popular statistical and machine learning models in detecting financial statement fraud. The research objects are companies which experienced both fraudulent and non-fraudulent financial statements betw...

متن کامل

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...

متن کامل

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

متن کامل

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000